Abstract

A general-purpose method to mechanically transform
system requirements into a provably equivalent model has
yet to appear. Such a method represents a necessary step
toward high-dependability system engineering for numerous
possible application domains, including sensor networks
and autonomous systems. Currently available tools
and methods that start with a formal model of a system and
mechanically produce a provably equivalent implementation
are valuable but not sufficient. The “gap” unfilled by
such tools and methods is that their formal models cannot
be proven to be equivalent to the system requirements as
originated by the customer. For the classes of systems whose
behavior can be described as a finite (but significant) set of
scenarios, we offer a method for mechanically transforming
requirements (expressed in restricted natural language,
or in other appropriate graphical notations) into a provably
equivalent formal model that can be used as the basis
for code generation and other transformations.

Key Words: Validation, verification, formal methods, auto- 
matic code generation, sensor networks 

1. Introduction 

Sensor networks and other highly distributed au- 
tonomous systems cannot attain high dependability with- 
out addressing software dependability issues. Develop- 
ment of a system that will have a high level of reliability 
requires the developer to represent the system as a for- 
mal model that can be proven to be correct. Through the 
use of currently available tools, the model can then be au- 
tomatically transformed into code with minimal or no 
human intervention to reduce the chance of inadvertent in- 
sertion of errors by developers. Automatically producing 
the formal model from customer requirements would further
reduce the chance of insertion of errors by developers.

The need for ultra-high dependability systems increases 
continually, along with a correspondingly increasing need 
to ensure correctness in system development. By “correctness”,
we mean that the implemented system is equivalent
to the requirements, and that this equivalence can be proved 
mathematically. 

Available system development tools and methods that 
are based on formal models provide neither automated gen- 
eration of the models from requirements nor automated 
proof of correctness of the models. Therefore, today there 
is no automated means to produce a system or a procedure 
that is a provably correct implementation of the customer's 
requirements. Further, requirements engineering as a disci- 
pline has yet to produce an automated, mathematics-based 
process for requirements validation [14]. 

2. Problem Statement 

Automatic code generation from requirements has been
the ultimate objective of software engineering almost since
the advent of high-level programming languages, and calls
for a “requirements-based programming” capability have
become deafening [8]. Several tools and products exist in
the marketplace for automatic code generation from a given
model. However, they typically generate code, portions of
which are never executed, or portions of which cannot be
justified from either the requirements or the model. Moreover,
existing tools do not and cannot overcome the fundamental
inadequacy of all currently available automated development
approaches, which is that they include no means
to establish a provable equivalence between the requirements
stated at the outset and either the model or the code
they generate.

Traditional approaches to automatic code generation presuppose
the existence of an explicit (formal) model of reality
that can be used as the basis for subsequent code generation
(see Figure 1 (a)). While such an approach is reasonable,
the advantages and disadvantages of the various
modeling approaches used in computing are well known,
and certain models can serve well to highlight certain issues
while suppressing other less relevant details [18]. It is
clear that the converse is also true. Certain models of reality,
while successfully detailing many of the issues of interest
to developers, can fail to capture some important issues,
or perhaps even the most important issues. Existing
reverse-engineering approaches suffer from a similar plight. In typical
approaches, such as the one illustrated in Figure 1 (b), a
model is extracted from an existing system and is then represented
in various ways, for example as a digraph. The
re-engineering process then involves using the resulting representation
as the basis for code generation, as above [14].

Figure 1. (a) traditional software development
process from requirements to code, and (b) reverse
engineering from code to a system description.

2.1. Specifications, Models, and Designs 

The model on which automatic code generation is
based is referred to as a design, or more correctly, a design
specification. There is typically a mismatch between
the design and the implementation (sometimes
termed the “specification-implementation gap”) in that
the process of going from a suitable design to an implementation
involves many practical decisions that must
be made by the automated tool used for code generation
without any clear-cut justifications, other than the
predetermined implementation decisions of the tool designers.
There is a more problematic “gap”, termed the
“analysis-specification gap”, that emphasizes the problem
of capturing requirements and adequately representing
them in a specification that is clear, concise, and complete.
This specification must be formal, or proof of
correctness is impossible [1]. Unfortunately, many are reluctant
to embrace formal specification techniques,
believing them to be difficult to use and apply [6] [2], despite
many industrial success stories [11] [12] [14].

Figure 2. The R2D2C approach, generating a formal
model from requirements and producing code
from the formal model, with automatic reverse engineering.

Our experience at NASA Goddard Space Flight Center 
(GSFC) has been that while engineers are happy to write 
descriptions as natural language scenarios, they are loath to 
undertake formal specification. Absent a formal specifica- 
tion of the system under consideration, there is no possibil- 
ity of determining any level of confidence in the correctness 
of an implementation [14]. 

2.2. A Novel Approach 

Our approach involves providing a mathematically 
tractable round-trip engineering approach to system devel- 
opment. The approach described herein is provisionally 
named R2D2C (“Requirements to Design to Code”) [14]. 

In this approach, engineers (or others) may write specifi- 
cations as scenarios in constrained (domain-specific) natu- 
ral language, or in a range of other notations (including Uni- 
fied Modeling Language (UML) use cases). These will be 
used to derive a formal model (Figure 2) that is guaranteed 
to be equivalent to the requirements stated at the outset, and 
which will subsequently be used as a basis for code genera- 
tion. The formal model can be expressed using a variety of 
formal methods. Currently we are using CSP, Hoare’s lan- 
guage of Communicating Sequential Processes [15] [16], 
which is suitable for various types of analysis and investi- 
gation, and as the basis for fully formal implementations as 
well as for use in automated test case generation, etc. [14]. 

3. Technical Approach 

R2D2C is unique in that it allows for full formal devel- 
opment from the outset, and maintains mathematical sound- 
ness through all phases of the development process, from re- 
quirements through to automatic code generation. The approach
may also be used for reverse engineering, that is, in
retrieving models and formal specifications from existing 
code, as shown in Figure 2. The approach can also be used 
to “paraphrase” (in natural language, etc.) formal descrip- 
tions of existing systems. In addition, the approach is not 
limited to generating high-level code. It may also be used to 
generate business processes and procedures, and we are cur- 
rently experimenting with using it to generate instructions 
for robotic devices to be used on the Hubble Robotic Ser- 
vicing Mission (HRSM). We are also experimenting with 
using it as a basis for an expert system verification tool, and 
as a means of capturing expert knowledge for expert sys- 
tems. 

Section 3.1 describes the approach at a relatively high 
level. Section 3.2 describes an intermediate version of the 
approach for which we have built a prototype tool [19], and 
with which we have successfully undertaken some exam- 
ples. 

3.1. R2D2C

The R2D2C approach involves a number of phases, 
which are reflected in the system architecture described in 
Figure 3. The following describes each of these phases. 

D1 Scenarios Capture: Engineers, end users, and others 
write scenarios describing intended system operation. 
The input scenarios may be represented in a con- 
strained natural language using a syntax-directed ed- 
itor, or may be represented in other textual or graphi- 
cal forms. 

D2 Traces Generation: Traces and sequences of atomic 
events are derived from the scenarios defined in D1.

D3 Model Inference: A formal model, or formal specifi- 
cation, expressed in CSP is inferred by an automatic 
theorem prover - in this case, ACL2 [17] - using the
traces derived in phase D2. A deep¹ embedding of the
laws of concurrency [13] in the theorem prover gives 
it sufficient knowledge of concurrency and of CSP to 
perform the inference. The embedding will be the topic 
of a future paper. 

D4 Analysis: Based on the formal model, various analy- 
ses can be performed, using currently available com- 
mercial or public domain tools, and specialized tools 
that are planned for development. Because of the na- 
ture of CSP, the model may be analyzed at different 
levels of abstraction using a variety of possible imple- 
mentation environments. This will be the subject of a 
future paper. 



Figure 3. The entire process, with D1 through D5
illustrating the development approach and
R1 through R4 the reverse engineering.


D5 Code Generation: The techniques of automatic code 
generation from a suitable model are reasonably well 
understood. The present modeling approach is suitable 
for the application of existing code generation tech- 
niques, whether using a tool specifically developed for 
the purpose, or existing tools such as FDR [5], or con- 
verting to other notations suitable for code generation 
(e.g., converting CSP to B [3]) and then using the code 
generating capabilities of the B Toolkit. 
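Phase D2 can be illustrated with a small sketch. It assumes, hypothetically, that a scenario has already been reduced to an ordered list of atomic event names; the class `ScenarioTraces` and the event names are illustrative only and are not part of the R2D2C tool. The traces of a single scenario are taken to be all prefixes of its event sequence, and alternative scenarios contribute the union of their traces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: derives a prefix-closed trace set from scenarios. */
final class ScenarioTraces {

    /** Returns every prefix of the event sequence, from the empty trace to the full scenario. */
    static Set<List<String>> tracesOf(List<String> events) {
        Set<List<String>> traces = new LinkedHashSet<>();
        for (int n = 0; n <= events.size(); n++) {
            // Copy the prefix so later changes to the input cannot alter the result.
            traces.add(Collections.unmodifiableList(new ArrayList<>(events.subList(0, n))));
        }
        return traces;
    }

    /** Alternative scenarios (joined by OR) contribute the union of their traces. */
    static Set<List<String>> tracesOfAlternatives(List<List<String>> scenarios) {
        Set<List<String>> all = new LinkedHashSet<>();
        for (List<String> scenario : scenarios) {
            all.addAll(tracesOf(scenario));
        }
        return all;
    }

    public static void main(String[] args) {
        // Illustrative event names only.
        List<String> first = Arrays.asList("receiveFault", "sendFault");
        List<String> second = Arrays.asList("receiveData", "sendData");
        for (List<String> trace : tracesOfAlternatives(Arrays.asList(first, second))) {
            System.out.println(trace);
        }
    }
}
```

The empty trace appears only once in the union, which is why a set rather than a list is used to collect the results.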



Figure 4. Reverse engineering of system using 
R2D2C. 


1 “Deep” in the sense that the embedding is semantic rather than merely 
syntactic. 


It should be re-emphasized that the “code” that is generated
may be code in a high-level programming language,
low-level instructions for (electro-) mechanical
devices, natural-language business procedures and instructions,
or the like. As Figure 4 illustrates, the above process
may also be run in reverse:

R1 Model Extraction: Using various reverse engineering
techniques, a formal model expressed in CSP may be
extracted.

R2 Traces Generation: The theorem prover may be used
to automatically generate traces based on the laws of
concurrency and the embedded knowledge of CSP.

R3 Analysis: Traces may be analyzed, used to check for
various conditions, undesirable situations arising, etc.

R4 Paraphrasing: A description of the system (or system
components) may be retrieved in the desired format
(natural language scenarios, UML use cases, etc.).
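As an illustration of the kind of check step R3 refers to, the following sketch tests one simple condition over a set of traces: that a response event never occurs without an earlier, not yet answered trigger event. The class `TraceCheck`, the method name, and the property itself are hypothetical choices made for this example; the analyses actually supported by R2D2C are not specified here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical trace analysis: checks one ordering property over a set of traces. */
final class TraceCheck {

    /**
     * Returns true if, in every trace, each occurrence of {@code response}
     * is preceded by a {@code trigger} that has not already been answered.
     */
    static boolean respondsOnlyAfter(Set<List<String>> traces, String trigger, String response) {
        for (List<String> trace : traces) {
            int pending = 0; // triggers seen but not yet answered
            for (String event : trace) {
                if (event.equals(trigger)) {
                    pending++;
                } else if (event.equals(response)) {
                    if (pending == 0) {
                        return false; // response without a preceding trigger
                    }
                    pending--;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<List<String>> traces = new HashSet<>();
        traces.add(Arrays.asList("inWSNMA.fault", "outWSNMA.fault"));
        traces.add(Arrays.asList("inWSNMA.fault"));
        System.out.println(respondsOnlyAfter(traces, "inWSNMA.fault", "outWSNMA.fault"));
    }
}
```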

Paraphrasing, whereby more understandable descriptions
(above and beyond existing documentation) of existing
systems or system components are extracted, is
likely to have useful application in future system maintenance
for systems whose original design documents have
been lost, or for systems that have been modified so much that
the original design and requirements documents do not reflect
the current system.

3.2. Short-cut R2D2C 

The approach described in Section 3.1 is the way that 
R2D2C is intended to be applied, from requirements speci- 
fication through to code generation. However, the approach 
requires significant computing power in the form of an au- 
tomated theorem prover performing significant inferences 
based on traces input and its “knowledge” of the laws of 
concurrency. While this is well warranted for certain appli- 
cations, it is likely to be beyond the resources of many de- 
velopers and organizations. As a practical concession, we 
also define a reduced version of R2D2C called the short- 
cut version (Figure 5), whereby the use of a theorem prover 
is avoided, yet without sacrificing high confidence in the va- 
lidity of the approach. The following describes each of the 
phases for the shortcut R2D2C: 

S1 Scenarios Capture: As before, intended system behavior
is described by scenarios input in natural language,
or an appropriate graphical or semi-formal notation.

S2 Translation to Intermediate Notation: Scenarios are
translated to an intermediate notation, termed EzyCSP,
which is a simple natural language-like subset of CSP
that can be used to describe a large number of situations
and scenarios (recall that scenarios are domain
specific).

S3 Analysis: While far simpler than CSP, EzyCSP allows
some simple analyses to be performed.

S4 Implementation in Java: EzyCSP is sufficiently simple
that it may easily be translated to Java and executed.

This simplified or short-cut approach clearly has significant 
disadvantages when compared to our full approach. Firstly, 
the correctness of the development process is contingent 
on the correctness of both the translation of scenarios to 
the intermediate (EzyCSP) notation and the translation of 
EzyCSP to Java. However, the correctness of the translators 
for these is assured via a proof of correctness undertaken 
with the ACL2 theorem prover. Secondly, we do not have a 
reverse process, suitable to support reverse and (ultimately) 
re-engineering, for free. However, a Java-to-EzyCSP trans- 
lator would certainly be possible for highly constrained sub- 
sets of Java. 

The significant advantage of this simplified approach, 
however, is that although a proof of correctness involving 
a theorem prover is still required, this is required exactly 
once and would be performed by the support system devel- 
opers (presumably expert in the art). This is significantly 
less expensive computationally than using a theorem prover 
in the development of each individual application. 



Figure 5. Short cut R2D2C. 


4. A Simple Example 

The Lights-Out Ground Operating System (LOGOS) is 
a proof-of-concept NASA system for automatic control of 
ground stations when satellites pass overhead and under 
their control. The system exhibits both autonomous and au- 
tonomic properties [?] [22], and operates by having a com- 
munity of distributed autonomous software modules work 
cooperatively to perform the functions previously under- 
taken by human operators using traditional software tools, 
such as orbit generators and command sequence planners. 
A post-implementation formal specification of the system
was undertaken in CSP [20] [10]. Using CSP, a number of
anomalies, conflicts, and omissions in the system were dis- 
covered that had not been detected in testing and/or actual 
execution. This experience is typical of highly distributed 
systems, such as sensor networks or other multi-agent based 
systems where dependability is both very important and 
very difficult to evaluate. The same approach can be used 
for space-based wireless sensor network (WSN) systems where a control station is
in charge of several WSNs located on spacecraft in deep
space. An example is the Autonomous Nano Technology 
Swarm mission (ANTS) [4], which is at the concept devel- 
opment phase. This mission will send 1,000 pico-class (ap- 
proximately 1 kg) spacecraft to explore the asteroid belt. 
The ANTS spacecraft will act as a sensor network making 
observations of asteroids and analyzing their composition. 

4.1. Specification of LOGOS 

We will not consider the entire LOGOS system here. Al- 
though a relatively small system, it is too extensive to il- 
lustrate in its entirety in this paper. Instead, we will take 
an example agent from the system, and illustrate its map- 
ping from natural language descriptions through to simple 
Java implementation. 

Let us first illustrate, via a trivial example, how scenarios 
map to CSP. Suppose we have the following as part of one 
of the scenarios for the system: 

if the WSN Monitoring Agent receives a fault ad- 
visory from the WSN the agent sends the fault 
to the Fault Resolution Agent 
OR 

if the WSN Monitoring Agent receives engineer- 
ing data from the WSN the agent sends the 
data to the Trending Agent 

That part of the scenario could be mapped to structured text 
as: 

inWSNMA?fault from WSN
then outWSNMA!fault to FIRE
else
inengWSNMA?data from WSN
then outengWSNMA!data to TREND
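The mapping from a scenario sentence to structured text can be sketched as a pattern match followed by table lookups. The sketch below is an assumption-laden illustration, not the R2D2C translator: the class `ScenarioToStructuredText`, the regular expression, and both lookup tables (agent abbreviations and channel-name infixes) are hypothetical and were chosen only so that the two sentences above yield the structured text shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical translator for sentences of the form used in the example scenario. */
final class ScenarioToStructuredText {

    // Assumed sentence shape: "if the A receives [a] M from the S the agent sends the P to the T".
    private static final Pattern RULE = Pattern.compile(
        "if the (.+?) receives (?:an? )?(.+?) from the (.+?) the agent sends the (.+?) to the (.+)");

    // Hypothetical domain tables; a real tool would obtain these from the domain definition.
    private static final Map<String, String> ABBREVIATIONS = new HashMap<>();
    private static final Map<String, String> CHANNEL_INFIX = new HashMap<>();
    static {
        ABBREVIATIONS.put("WSN Monitoring Agent", "WSNMA");
        ABBREVIATIONS.put("Fault Resolution Agent", "FIRE");
        ABBREVIATIONS.put("Trending Agent", "TREND");
        CHANNEL_INFIX.put("fault advisory", "");
        CHANNEL_INFIX.put("engineering data", "eng");
    }

    /** Translates one scenario sentence into a line pair of structured text. */
    static String translate(String sentence) {
        Matcher m = RULE.matcher(sentence.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized scenario: " + sentence);
        }
        String agent = abbreviate(m.group(1));
        String infix = CHANNEL_INFIX.getOrDefault(m.group(2), "");
        String source = abbreviate(m.group(3));
        String payload = m.group(4);
        String target = abbreviate(m.group(5));
        return "in" + infix + agent + "?" + payload + " from " + source
            + " then out" + infix + agent + "!" + payload + " to " + target;
    }

    private static String abbreviate(String name) {
        return ABBREVIATIONS.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        System.out.println(translate("if the WSN Monitoring Agent receives a fault advisory "
            + "from the WSN the agent sends the fault to the Fault Resolution Agent"));
        System.out.println(translate("if the WSN Monitoring Agent receives engineering data "
            + "from the WSN the agent sends the data to the Trending Agent"));
    }
}
```

A fixed sentence pattern is workable here only because the input language is constrained and domain specific; free-form natural language would need a real parser.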

The laws of concurrency would allow us to derive the traces 
as: 

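\[
t_{WSNMA} \supseteq \{\langle\rangle,\ \langle inWSNMA, fault\rangle,\ \langle inWSNMA, fault, outWSNMA, fault\rangle\}
\;\cup\; \{\langle\rangle,\ \langle inengWSNMA, data\rangle,\ \langle inengWSNMA, data, outengWSNMA, data\rangle\}
\]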

From the traces, we can infer an equivalent CSP process 
specification as: 


\[
WSNMA = inWSNMA?fault \rightarrow (outWSNMA!fault \rightarrow SKIP)
\;\mid\; (inengWSNMA?data \rightarrow outengWSNMA!data \rightarrow SKIP)
\]
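The relationship between the process and its traces can be checked in the forward direction with a few lines of code. This is only a sketch under stated assumptions: it enumerates traces directly rather than using ACL2 or the laws of concurrency, it treats each communication as a single compound event such as `inWSNMA.fault`, and it ignores the termination event of SKIP. The class `CspTraces` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical mini-model of CSP prefix and choice, with trace enumeration. */
final class CspTraces {

    interface Process {
        Set<List<String>> traces();
    }

    /** SKIP, modeled here as contributing only the empty trace (termination event ignored). */
    static Process skip() {
        return () -> Collections.singleton(Collections.<String>emptyList());
    }

    /** traces(e -> P) = { <> } union { <e> ^ t | t in traces(P) }. */
    static Process prefix(String event, Process next) {
        return () -> {
            Set<List<String>> result = new HashSet<>();
            result.add(Collections.<String>emptyList());
            for (List<String> t : next.traces()) {
                List<String> extended = new ArrayList<>();
                extended.add(event);
                extended.addAll(t);
                result.add(extended);
            }
            return result;
        };
    }

    /** traces(P | Q) = traces(P) union traces(Q). */
    static Process choice(Process left, Process right) {
        return () -> {
            Set<List<String>> result = new HashSet<>(left.traces());
            result.addAll(right.traces());
            return result;
        };
    }

    /** The example process, with each channel/value pair written as one compound event. */
    static Process wsnma() {
        return choice(
            prefix("inWSNMA.fault", prefix("outWSNMA.fault", skip())),
            prefix("inengWSNMA.data", prefix("outengWSNMA.data", skip())));
    }

    /** The traces derived from the scenario, written out by hand. */
    static Set<List<String>> scenarioTraces() {
        return new HashSet<>(Arrays.asList(
            Collections.<String>emptyList(),
            Arrays.asList("inWSNMA.fault"),
            Arrays.asList("inWSNMA.fault", "outWSNMA.fault"),
            Arrays.asList("inengWSNMA.data"),
            Arrays.asList("inengWSNMA.data", "outengWSNMA.data")));
    }

    public static void main(String[] args) {
        System.out.println(wsnma().traces().equals(scenarioTraces()));
    }
}
```

Enumerating traces is feasible only because this process is finite and tiny; it says nothing about how the inference from traces back to a process is performed.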

The R2D2C prototype tool will produce Java code from 
the CSP as follows: 

class WSNMonitoringAgent extends Thread {
    Transaction fault;
    Transaction faultadvisory;
    boolean running;

    public WSNMonitoringAgent(...,
                              Transaction fault,
                              Transaction faultadvisory) {
        this.fault = fault;
        this.faultadvisory = faultadvisory;
    }

class WSNMonitoringAgent extends Thread {
    Transaction data;
    Transaction engineeringdata;
    Transaction fault;
    Transaction faultadvisory;
    boolean running;

    public WSNMonitoringAgent(Transaction data,
                              Transaction engineeringdata,
                              Transaction fault,
                              Transaction faultadvisory) {
        this.data = data;
        this.engineeringdata = engineeringdata;
        this.fault = fault;
        this.faultadvisory = faultadvisory;
    }

    public void run() {
        int index = 0;
        running = true;
        while (running) {
            switch (index) {
            case 0:
                // Wait for the fault advisory, then for the fault to be passed on.
                while (faultadvisory.committed() == false);
                Test.out.println("faultadvisory");
                Test.out.flush();
                while (fault.committed() == false);
                Test.out.println("fault");
                Test.out.flush();
                break;
            case 1:
                // Wait for the engineering data, then for the data to be passed on.
                while (engineeringdata.committed() == false);
                Test.out.println("engineeringdata");
                Test.out.flush();
                while (data.committed() == false);
                Test.out.println("data");
                Test.out.flush();
                break;
            }
            index++;
            // Assumed wrap-around over the two cases; the source statement is illegible.
            index %= 2;
        }
    }
}
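The generated class relies on a `Transaction` type whose definition is not shown. To make the busy-wait idiom concrete, the sketch below supplies a hypothetical minimal `Transaction` with a `committed()` query and an assumed `commit()` method, and exercises the same wait loop from a second thread. None of this is taken from the R2D2C prototype; in particular `commit()` and `TransactionDemo` are names invented for the illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the Transaction type used by the generated code. */
class Transaction {
    private final AtomicBoolean committed = new AtomicBoolean(false);

    /** Marks the communication as having taken place (assumed method name). */
    void commit() {
        committed.set(true);
    }

    /** Reports whether the communication has taken place. */
    boolean committed() {
        return committed.get();
    }
}

/** Exercises the busy-wait handshake used in the generated run() method. */
final class TransactionDemo {

    static String runOnce() {
        final Transaction faultadvisory = new Transaction();
        final StringBuilder log = new StringBuilder();

        Thread agent = new Thread(() -> {
            // Same idiom as the generated code: spin until the transaction commits.
            while (!faultadvisory.committed()) {
                Thread.yield();
            }
            log.append("faultadvisory");
        });

        agent.start();
        faultadvisory.commit();
        try {
            agent.join(); // join makes the agent's write to log visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for agent", e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

An atomic flag is used so that the spinning thread is guaranteed to observe the commit; with a plain boolean field the loop could in principle spin forever.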

5. Application to Sensor Networks 

NASA is currently conducting research and development
on sensor networks for planetary and solar system exploration
as well as to support its Mission to Planet Earth.
In addition to the ANTS mission, a similar mission is be- 
ing considered to explore the rings of Saturn. Sensor net- 
works are also being considered for planetary (e.g., Mar- 
tian) exploration, to yield valuable scientific information on 
weather and geological aspects. For the Mission to Planet 
Earth, sensor networks are already being researched and de- 
veloped towards capabilities for early warnings about natu- 
ral disasters and climate change. With the system of systems 
nature of sensor networks, the inter-relatedness of these sys- 
tems all networked together will create a level of complex- 
ity that will require a new level of dependability and a cor- 
responding new approach to system and software develop- 
ment. 

Projected NASA sensor networks are highly distributed 
autonomous “systems of systems” that must operate with a 
high degree of reliability. The solar system and planetary 
exploration networks will necessarily experience long com- 
munications delays with Earth, will partly and occasionally 
be out of touch with the Earth and mission control for long 
periods of time, and must operate under extremes of dy- 
namic environmental conditions. Due to the complexity of 
these systems as well as their distributed and parallel nature, 
they will have an extremely large state space and will be 
impossible to test completely using traditional testing tech- 
niques. The more “code” or instructions that can be gener- 
ated automatically from a verifiably correct model, the less 
likely that human developers will introduce errors. In addi- 
tion, the higher the level of abstraction that developers can 
work from, as is afforded through the use of scenarios to de- 
scribe system behavior, the less likely that a mismatch will 
occur between requirements and implementation and the 
more likely that the system can be validated. Working from 
a higher level of abstraction will also allow errors in the 
system to be more easily caught, since developers can bet- 
ter see the “big picture” of the system. In addition to allow- 
ing complex systems developers to work at a higher level of 
abstraction, R2D2C also converts the scenarios into a for- 
mal model that can be analyzed for concurrency-related er- 
rors and consistency and completeness, as well as domain- 
specific errors. 

6. Related Work 

Harel [7] [9] has advocated scenario-based programming
through UML use cases and play-in scenarios. This work
differs in that it uses scenarios in the form of structured
text that is easily understandable by engineers and non-engineers.
In addition, converting the structured
text to traces and then from traces to a formal model
allows us to use a wide range of formal methods tools (e.g.,
model checkers), which can be used to verify and validate
the system [14].


NASA Ames has been working on the automatic translation
of UML use cases to executable code, and reports success
in using the approach on large applications [23]. Our
approach is different, however, in that we are not limited to 
UML use cases, nor to natural language. R2D2C will work 
equally well with any input mechanism whereby require- 
ments can be represented as scenarios, and traces extracted. 
Our approach works equally well with graphical, mathemat- 
ical, and textual requirements representations. More impor- 
tantly, the key to our approach and what makes it invalu- 
able for high-dependability applications is the full formal 
basis, and complete mathematical tractability from require- 
ments through to code. To our knowledge, no other cur- 
rently available automated development methodology can 
make this claim [14]. 

7. Conclusions and Future Work 

R2D2C is a unique approach to the automatic derivation 
of ultra-high dependability systems. It is unique in that it 
supports fully (mathematically) tractable development from 
requirements elicitation through to automatic code genera- 
tion (and back again). While other approaches have sup- 
ported various subsets of the development lifecycle, there 
has been heretofore a “jump” in deriving from the requirements
the formal model that is a prerequisite for sound automatic
code generation. Yet, R2D2C is a simple approach,
combining techniques and notations that are well under- 
stood, well tried and tested, and trusted. The novelty of the 
approach, and the part of the approach that achieves conti- 
nuity in the development process, is the use of a theorem 
prover to reverse the laws of concurrency, and to achieve 
levels of inference that would be impossible for a human 
being to perform on all but trivial systems [14]. 

R2D2C (and other approaches that similarly provide 
mathematical soundness throughout the development life- 
cycle) will decrease costs and delays for the engineering 
(and re-engineering) of ultra-high dependability systems 
through automated development. Such technology will dra- 
matically increase assurance of system success by ensuring 
that requirements are complete and consistent, implementa- 
tions are true to the requirements, automatically coded sys- 
tems are bug-free, and implementation behavior is as ex- 
pected [14]. 

Future work will include improving the quality of the 
embedding of CSP in ACL2, and optimizing that for effi- 
ciency. We plan a plethora of support tools to allow us to 
easily change the level of abstraction in a formal model, to 
visualize various system models and changes in those mod- 
els, and to aid in tracking changes through the development 
process (or the reverse engineering process). We plan to en- 
hance our existing prototype to support the full version of 
R2D2C, to make it into a fully functional robust prototype,
and to apply it to more significant examples than the one
presented in this paper [14].